Problem: A (missing) boosting-type convergence result for ADABOOST.MH with factorized multi-class classifiers
Abstract
In (Kégl, 2014), we recently showed empirically that ADABOOST.MH is one of the best multi-class boosting algorithms when the classical one-against-all base classifiers, proposed in the seminal paper of Schapire and Singer (1999), are replaced by factorized base classifiers containing a binary classifier and a vote (or code) vector. In a slightly different setup, a similar factorization coupled with an iterative optimization of the two factors also proved to be an excellent approach (Gao and Koller, 2011). The main algorithmic advantage of our approach over the original setup of Schapire and Singer (1999) is that trees can be built in a straightforward way by using the binary classifier at the inner nodes. In this open problem paper we take a step back to the basic setup of boosting generic multi-class factorized (Hamming) classifiers (so no trees), and state the classical problem of boosting-like convergence of the training error. Given a vote vector, training the classifier leads to a standard weighted binary classification problem. The main difficulty in proving the convergence is that, unlike in binary ADABOOST, the sum of the weights in this weighted binary classification problem is less than one, which means that the lower bound on the edge, coming from the weak learning condition, shrinks. To show the convergence, we need a (uniform) lower bound on the sum of the weights in this derived binary classification problem.

Let the training data be $\mathcal{D} = \{(\mathbf{x}_1, \mathbf{y}_1), \ldots, (\mathbf{x}_n, \mathbf{y}_n)\}$, where $\mathbf{x}_i \in \mathbb{R}^d$ are observation vectors and $\mathbf{y}_i \in \{\pm 1\}^K$ are label vectors. Sometimes we will use the notion of an $n \times d$ observation matrix $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_n)$ and an $n \times K$ label matrix $\mathbf{Y} = (\mathbf{y}_1, \ldots, \mathbf{y}_n)$ instead of the set of pairs $\mathcal{D}$. In multi-class classification, the single label $\ell(\mathbf{x})$ of the observation $\mathbf{x}$ comes from a finite set. Without loss of generality, we will suppose that $\ell \in \mathcal{L} = \{1, \ldots, K\}$. The label vector $\mathbf{y}$ is a one-hot representation of the correct class: the $\ell(\mathbf{x})$th element of $\mathbf{y}$ will be $1$ and all the other elements will be $-1$. To avoid confusion, from now on we will call $\mathbf{y}$ and $\ell$ the label and the label index of $\mathbf{x}$, respectively. The goal of the ADABOOST.MH algorithm (Schapire and Singer, 1999; Appendix B) is to return a vector-valued discriminant function $\mathbf{f}^{(T)}: \mathbb{R}^d \to \mathbb{R}^K$ with a small Hamming loss.
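For concreteness, the weighted Hamming loss in question can be stated as follows. This is a reconstruction following Schapire and Singer (1999) and Kégl (2014); the weight matrix $\mathbf{W} = \{w_{i,\ell}\}$ over example-label pairs (nonnegative, summing to one) is assumed notation rather than quoted from the text:

$$ R_{\mathrm{H}}\bigl(\mathbf{f}^{(T)}, \mathbf{W}\bigr) = \sum_{i=1}^{n} \sum_{\ell=1}^{K} w_{i,\ell} \, \mathbb{I}\bigl\{ \operatorname{sign}\bigl(f^{(T)}_{\ell}(\mathbf{x}_i)\bigr) \neq y_{i,\ell} \bigr\}, $$

that is, the weighted fraction of example-label pairs on which the sign of the discriminant disagrees with the $\pm 1$ label matrix $\mathbf{Y}$.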
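To make the reduction mentioned above concrete, here is a minimal sketch in Python/NumPy of how, once the vote vector is fixed, maximizing the multi-class edge of a factorized base classifier $h(\mathbf{x}) = \mathbf{v}\,\varphi(\mathbf{x})$ (with $\varphi$ binary-valued) collapses into a standard weighted binary classification problem. The function name and variable names are illustrative assumptions consistent with the setup described above, not code from the paper:

```python
import numpy as np

def derived_binary_problem(W, Y, v):
    """Sketch: for a fixed vote vector v in {-1,+1}^K, reduce multi-class
    edge maximization to weighted binary classification.

    The edge of the factorized base classifier h(x) = v * phi(x), with
    phi: R^d -> {-1,+1}, is
        gamma = sum_i sum_l W[i,l] * v[l] * phi(x_i) * Y[i,l]
              = sum_i s_i * phi(x_i),  s_i = sum_l W[i,l] * v[l] * Y[i,l],
    so phi should be trained on binary labels sign(s_i) with weights |s_i|.
    """
    s = (W * Y) @ v                   # s_i = sum_l W[i,l] v[l] Y[i,l]
    return np.sign(s), np.abs(s)      # derived labels y'_i, weights w'_i

# Tiny illustration of the difficulty stated in the abstract: although the
# multi-class weights sum to one, the derived binary weights can sum to
# strictly less than one, because terms of opposite sign cancel inside s_i.
rng = np.random.default_rng(0)
n, K = 5, 3
label_index = rng.integers(0, K, size=n)
Y = -np.ones((n, K))
Y[np.arange(n), label_index] = 1.0    # one-hot {-1,+1} label matrix
W = rng.random((n, K))
W /= W.sum()                          # multi-class weights sum to one
v = np.array([1.0, -1.0, 1.0])        # an arbitrary vote vector
_, w_prime = derived_binary_problem(W, Y, v)
print(w_prime.sum())                  # <= 1, typically strictly smaller
```

Roughly speaking, an edge of $\gamma$ on the normalized derived problem translates into an edge of only $\gamma \sum_i w'_i$ on the multi-class problem, which is why a uniform lower bound on $\sum_i w'_i$ is the missing ingredient.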
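For reference, the boosting-like convergence sought here mirrors the classical guarantee for binary ADABOOST, quoted from standard boosting theory (Freund and Schapire, 1997; Schapire and Singer, 1999) rather than from this paper: if every base classifier attains edge $\gamma_t \ge \gamma > 0$, the training error decays as

$$ \widehat{R}\bigl(f^{(T)}\bigr) \;\le\; \prod_{t=1}^{T} \sqrt{1 - \gamma_t^2} \;\le\; \exp\Bigl(-\tfrac{1}{2}\sum_{t=1}^{T} \gamma_t^2\Bigr) \;\le\; e^{-T\gamma^2/2}. $$

The open problem is to prove an analogous bound for ADABOOST.MH with factorized base classifiers, where this argument breaks down unless the weight mass of the derived binary problem is uniformly bounded away from zero.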